Extracting and Visualizing Stock Data

In [3]:
!pip install yfinance==0.2.38
!pip install pandas==2.2.2
!pip install nbformat
In [4]:
import yfinance as yf
import pandas as pd
import requests
from bs4 import BeautifulSoup
import plotly.graph_objects as go
from plotly.subplots import make_subplots
In [5]:
import warnings
# Suppress FutureWarning messages only; other warning categories are still shown
warnings.filterwarnings("ignore", category=FutureWarning)
In [6]:
def make_graph(stock_data, revenue_data, stock):
    fig = make_subplots(rows=2, cols=1, shared_xaxes=True, subplot_titles=("Historical Share Price", "Historical Revenue"), vertical_spacing = .3)
    # Keep only rows up to the cutoff dates used for the plots
    stock_data_specific = stock_data[stock_data.Date <= '2021-06-14']
    revenue_data_specific = revenue_data[revenue_data.Date <= '2021-04-30']
    # infer_datetime_format is deprecated in pandas 2.x; strict inference is now the default
    fig.add_trace(go.Scatter(x=pd.to_datetime(stock_data_specific.Date), y=stock_data_specific.Close.astype("float"), name="Share Price"), row=1, col=1)
    fig.add_trace(go.Scatter(x=pd.to_datetime(revenue_data_specific.Date), y=revenue_data_specific.Revenue.astype("float"), name="Revenue"), row=2, col=1)
    fig.update_xaxes(title_text="Date", row=1, col=1)
    fig.update_xaxes(title_text="Date", row=2, col=1)
    fig.update_yaxes(title_text="Price ($US)", row=1, col=1)
    fig.update_yaxes(title_text="Revenue ($US Millions)", row=2, col=1)
    fig.update_layout(
        showlegend=False,
        height=900,
        title=stock,
        xaxis_rangeslider_visible=True,
    )
    fig.show()
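
The date cutoff inside make_graph compares the Date column against a plain string. A minimal sketch of the same filtering step with explicit datetime handling is shown below. The helper name filter_until and the sample frame are hypothetical and not part of the notebook, and the timezone handling is an assumption about how one might treat the offset-aware dates that yfinance returns.

```python
import pandas as pd

def filter_until(df, cutoff):
    """Return the rows of df whose Date falls on or before cutoff (hypothetical helper)."""
    dates = pd.to_datetime(df["Date"])
    # Offset-aware dates cannot be compared with a naive Timestamp,
    # so drop the offset while keeping the local wall-clock time.
    if dates.dt.tz is not None:
        dates = dates.dt.tz_localize(None)
    return df[dates <= pd.Timestamp(cutoff)]

# Illustrative data only, not real prices
sample = pd.DataFrame({
    "Date": ["2021-06-11", "2021-06-14", "2021-06-15"],
    "Close": [1.0, 2.0, 3.0],
})
kept = filter_until(sample, "2021-06-14")
```

Parsing the cutoff with pd.Timestamp means a malformed date string is more likely to fail loudly instead of being silently interpreted.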

Question 1: Use yfinance to Extract Stock Data

In [7]:
tsla=yf.Ticker("TSLA")
In [8]:
tesla_data=tsla.history(period="max")
In [9]:
tesla_data.reset_index(inplace=True)
tesla_data.head()
Out[9]:
Date Open High Low Close Volume Dividends Stock Splits
0 2010-06-29 00:00:00-04:00 1.266667 1.666667 1.169333 1.592667 281494500 0.0 0.0
1 2010-06-30 00:00:00-04:00 1.719333 2.028000 1.553333 1.588667 257806500 0.0 0.0
2 2010-07-01 00:00:00-04:00 1.666667 1.728000 1.351333 1.464000 123282000 0.0 0.0
3 2010-07-02 00:00:00-04:00 1.533333 1.540000 1.247333 1.280000 77097000 0.0 0.0
4 2010-07-06 00:00:00-04:00 1.333333 1.333333 1.055333 1.074000 103003500 0.0 0.0

Question 2: Use Webscraping to Extract Tesla Revenue Data

In [10]:
url = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0220EN-SkillsNetwork/labs/project/revenue.htm"
html_data=requests.get(url).text
In [11]:
soup = BeautifulSoup(html_data, "html.parser")
In [19]:
tesla_revenue = pd.read_html(url, header=0)[1]
tesla_revenue.rename(columns={"Tesla Quarterly Revenue (Millions of US $)": "Date", "Tesla Quarterly Revenue (Millions of US $).1":"Revenue"}, inplace=True)
tesla_revenue.head()
Out[19]:
Date Revenue
0 2022-09-30 $21,454
1 2022-06-30 $16,934
2 2022-03-31 $18,756
3 2021-12-31 $17,719
4 2021-09-30 $13,757
In [34]:
# Strip dollar signs and thousands separators; the raw string avoids an invalid "\$" escape
tesla_revenue["Revenue"] = tesla_revenue["Revenue"].replace(r",|\$", "", regex=True)
In [35]:
tesla_revenue.dropna(inplace=True)
tesla_revenue = tesla_revenue[tesla_revenue['Revenue'] != ""]
In [36]:
tesla_revenue.tail(5)
Out[36]:
Date Revenue
48 2010-09-30 31
49 2010-06-30 28
50 2010-03-31 21
52 2009-09-30 46
53 2009-06-30 27
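
The cleaning steps above leave Revenue as strings, which make_graph later casts to float. A minimal sketch that bundles the same steps and also converts the columns to numeric and datetime types might look like the following. The function name clean_revenue and the sample values are hypothetical and are not the scraped Tesla figures.

```python
import pandas as pd

def clean_revenue(df):
    """Strip currency formatting, drop empty rows, and convert column types (hypothetical helper)."""
    out = df.copy()
    # Remove "$" and "," from values such as "$21,454"
    out["Revenue"] = out["Revenue"].str.replace(r"[$,]", "", regex=True)
    # Drop rows with a missing or empty revenue cell
    out = out.dropna(subset=["Revenue"])
    out = out[out["Revenue"] != ""]
    out["Revenue"] = pd.to_numeric(out["Revenue"])
    out["Date"] = pd.to_datetime(out["Date"])
    return out.reset_index(drop=True)

# Illustrative rows only; the last one mimics a quarter with no reported value
sample = pd.DataFrame({
    "Date": ["2022-09-30", "2022-06-30", "2010-03-31", "2009-12-31"],
    "Revenue": ["$21,454", "$16,934", "$21", None],
})
cleaned = clean_revenue(sample)
```

Working on a copy keeps the scraped frame unchanged, so the cleaning can be rerun without fetching the page again.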

Question 3: Use yfinance to Extract Stock Data

In [28]:
gamestop=yf.Ticker("GME")
In [29]:
gme_data = gamestop.history(period="max")
In [30]:
gme_data.reset_index(inplace=True)
gme_data.head()
Out[30]:
Date Open High Low Close Volume Dividends Stock Splits
0 2002-02-13 00:00:00-05:00 1.620129 1.693350 1.603296 1.691667 76216000 0.0 0.0
1 2002-02-14 00:00:00-05:00 1.712708 1.716074 1.670626 1.683251 11021600 0.0 0.0
2 2002-02-15 00:00:00-05:00 1.683250 1.687458 1.658001 1.674834 8389600 0.0 0.0
3 2002-02-19 00:00:00-05:00 1.666418 1.666418 1.578047 1.607504 7410400 0.0 0.0
4 2002-02-20 00:00:00-05:00 1.615920 1.662210 1.603296 1.662210 6892800 0.0 0.0

Question 4: Use Webscraping to Extract GME Revenue Data

In [31]:
url="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0220EN-SkillsNetwork/labs/project/stock.html"
html_data=requests.get(url).text
In [32]:
soup = BeautifulSoup(html_data, "html.parser")
In [33]:
# The second <tbody> on the page holds the quarterly revenue rows
table = soup.find_all('tbody')[1]
dates = []
revenues = []
for row in table.find_all('tr'):
    cols = row.find_all('td')
    if len(cols) == 2:
        date = cols[0].text.strip()
        revenue = cols[1].text.strip()
        dates.append(date)
        revenues.append(revenue)
gme_revenue = pd.DataFrame({
    "Date": dates,
    "Revenue": revenues
})
# Strip dollar signs and thousands separators; the raw string avoids an invalid "\$" escape
gme_revenue['Revenue'] = gme_revenue['Revenue'].replace({r'\$': '', ',': ''}, regex=True)
In [37]:
gme_revenue.tail()
Out[37]:
Date Revenue
57 2006-01-31 1667
58 2005-10-31 534
59 2005-07-31 416
60 2005-04-30 475
61 2005-01-31 709
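
The row-by-row extraction above can be tried offline on an inline HTML snippet. The sketch below assumes a page laid out like the one in the notebook, with the revenue rows in the second tbody. The helper name table_to_frame and the sample markup are hypothetical, and the figures are made up.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Made-up markup with two tables, mirroring the soup.find_all('tbody')[1] lookup
SAMPLE_HTML = """
<table><tbody><tr><td>2020</td><td>$6,000</td></tr></tbody></table>
<table><tbody>
  <tr><td>2020-04-30</td><td>$1,000</td></tr>
  <tr><td>2020-01-31</td><td>$2,500</td></tr>
  <tr><td colspan="2">footnote</td></tr>
</tbody></table>
"""

def table_to_frame(html_text, table_index=1):
    """Collect two-cell rows from the chosen <tbody> into a DataFrame (hypothetical helper)."""
    soup = BeautifulSoup(html_text, "html.parser")
    body = soup.find_all("tbody")[table_index]
    rows = []
    for tr in body.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        # Skip rows that are not a (date, revenue) pair, such as footnotes
        if len(cells) == 2:
            rows.append(cells)
    return pd.DataFrame(rows, columns=["Date", "Revenue"])

frame = table_to_frame(SAMPLE_HTML)
```

Collecting the rows in one list and building the DataFrame at the end keeps each date paired with its revenue value.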

Question 5: Plot Tesla Stock Graph

In [38]:
make_graph(tesla_data,tesla_revenue,'Tesla Shares & Revenue')

Question 6: Plot GameStop Stock Graph

In [39]:
make_graph(gme_data, gme_revenue, 'GameStop Shares & Revenue')